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Abstract - Given a decision process based on the approx- 
imate probability density function returned by a data assim- 
ilation algorithm, an interaction level between the decision 
making level and the data assimilation level is designed to 
incorporate the information held by the decision maker into 
the data assimilation process. Here the information held 
by the decision maker is a loss function at a decision time 
which maps the state space onto real numbers which repre- 
sent the threat associated with different possible outcomes 
or states. The new probability density function obtained will 
address the region of interest, the area in the state space with 
the highest threat, and will provide overall a better approx- 
imation to the true conditional probability density function 
within it. The approximation used for the probability density 
function is a Gaussian mixture and a numerical example is 
presented to illustrate the concept. 

Keywords: Adaptive Gaussian Sum, Decision Making, 
Uncertainty Propagation, Expected Loss. 

1 Introduction 

Chemical, Biological, Radiological, and Nuclear (CBRN) 
incidents are rare events but very consequential, which man- 
dates extensive research and operational efforts in mitigating 
their outcomes. For such critical applications the accuracy 
in predicting the future evolution of toxic plumes in a timely 
fashion represents an important part in the Decision Maker 
(DM) toolbox. Based on these forecasts, decisions can be 
made on evacuating cities, sheltering or medical gear de- 
ployment. Such decisions are taken based on a loss function 
or region of interest such as the population density in a given 
area. 

Many research projects try to model the atmospheric 
transport and diffusion of toxic plumes. While inherently 
stochastic and highly nonlinear, these mathematical models 
are able to capture just a part of the dynamics of the real 
phenomenon and the forward integration yields an uncer- 
tain prediction. The decision maker takes actions based on 
expected loss computed using both the predicted uncertainty 
and the loss function, which here maps a region of interest in 



the state space into a threat level, such as the population den- 
sity in a city. Thus the ability to propagate the uncertainty 
and errors throughout the dynamic system is of great impor- 
tance, and the evolution of the probability density function 
(pdf) has received much attention recently. 

Due to the uncertainty accumulation in integrating the 
model, the forecasts of the system become less and less use- 
ful for the decision maker. Data Assimilation (DA) offers a 
way to reduce the uncertainty by combining measurements 
provided by sensors with model prediction in a Bayesian 
way [14]. This gives an improved situation assessment for 
the hindcast and nowcast. Unfortunately the forecast, used 
to evaluate the impact assessment, is still affected by the ac- 
curacy of probability density function evolution. Even in the 
hindcast and nowcast cases if the sensors provide ambiguous 
measurements such a quadratic measurement model, the im- 
provement brought by DA may be marginal. 

For nonlinear systems, the exact description of the transi- 
tion pdf is provided by a linear partial differential equation 
(pde) known as the Fokker Planck Kolmogorov Equation 
(FPKE) [12]. Analytical solutions exist only for stationary 
pdf and are restricted to a limited class of dynamical systems 
[12]. Thus researchers are looking actively at numerical ap- 
proximations to solve the FPKE [8, 10], generally using the 
variational formulation of the problem. For discrete-time 
dynamical systems, solving for the exact solution, which is 
given by the Chapman-Kolmogorov Equation (CKE), yields 
the same problems as in the continuous-time case. Several 
other techniques exist in the literature to approximate the 
pdf evolution, the most popular being Monte Carlo (MC) 
methods [5], Gaussian closure [7] (or higher order moment 
closure), Equivalent Linearization [13], Stochastic Averag- 
ing [9], Gaussian mixture approximations [1,6, 16]. Fur- 
thermore, all these approaches provide only an approximate 
description of the uncertainty propagation problem by re- 
stricting the solution to a small number of parameters. 

All these assumptions employed make the problem 
tractable and computational efficient, which satisfies the re- 
quirement of minimizing decision latency. But the approx- 
imation given may be of little use when computing the ex- 



pected loss, since the method is not sensitive to the region of 
interest. Such an example may be given using Monte Carlo 
approximations or Gaussian Sum approximations, when the 
propagated uncertainty offers no or very little probabilistic 
support in the region of interest. In other words, no particles 
or Gaussian components are centered in the region of inter- 
est, and even though the probabilistic support may be infi- 
nite, the expected loss computed might be underestimated. 

Methods to deal with such situations have been devel- 
oped, from risk sensitive filters [3] to risk sensitive parti- 
cle filters [17]. The risk sensitive filters minimize the ex- 
pected exponential of estimation error, controlling this way 
how much to weigh the higher-order moments versus the 
lower-order moments. While weighting more the higher- 
order moments, these methods are not designed to be selec- 
tively sensitive to particular regions of interest in the state 
space. The risk sensitive particle filter is able to generate 
more samples in the region of interest, but at the expense of 
biasing the proposal distribution, thus the particles obtained 
are biased towards the region of interest. While this method 
is appropriate for fault detection, it provides a limited out- 
put for the decision maker who is interested in querying the 
probability density function for different numerical quanti- 
ties used in the decision process such as the expected loss, 
the mean or the mode of the pdf. 

The present paper is concerned with providing a better 
approximation to the probability density function by incor- 
porating contextual loss information held by the decision 
maker into the DA process. In this work we use a Gaussian 
mixture approximation to the probability density function. 
We propose a "non-intrusive" way in computing an approx- 
imate pdf that addresses the region of interest and it is closer 
to the true pdf. Non-intrusive refers here to the fact the we 
do not require a new DA method when incorporating the loss 
function into the derivation. 

A progressive selection method is designed to add new 
Gaussian components to the initial Gaussian mixture, in as- 
suring that probabilistic support is reaching the region of 
interest at the decision time. The initial weights of the new 
Gaussian components are set to zero and they are modified 
when propagated throughout the nonlinear dynamical sys- 
tem to minimize the error in the FPKE [16]. Therefore if 
there is any probability density mass in the region of inter- 
est it will be represented by the non-zero weight of the new 
Gaussian components at the decision time. 

The problem is stated in Section |2] and the Gaussian Sum 
approximation to the conditional pdf is presented in Section 
[3] The progressive selection of Gaussian components is de- 
rived in Section [4] An example to illustrate the concept is 
given in Section [5] and the conclusions and future work are 
discussed in Section|6] 

2 Problem Statement 

Consider a general n-dimensional continuous-time noise 
driven nonlinear dynamic system with uncertain initial con- 
ditions and discrete measurement model, given by the equa- 



tions: 

i(t)=f(t,x(t)) + g(t,x(t))r(t) (i) 

z fc =h(t k ,x k ) + \ k (2) 

and a set of k observations, Z k = {zi \ i = 1 . . .k}. 

We denote, x k = x(tfc), r(i) represents a Gaussian white 
noise process with the correlation function QS(t k +i — t k ), 
and the initial state uncertainty is captured by the pdf 
p(to,xa). The random vector \ k denotes the measurement 
noise, which is temporally uncorrelated, zero-mean random 
sequence with known covariance, R^. The process noise 
and the measurement noise are uncorrelated with each other 
and with the initial condition. 

We are interested in finding the conditional probability 
density function p(t, x(t) | Zfc). For t > t k we get the fore- 
cast pdf by integrating only Eq[T| forward, if t = t k we are 
interested in the nowcast or filtered pdf and for t < t k we 
obtain the hindcast or the smoothed pdf. 

Given a state space region of interest at a particular deci- 
sion time, td, which may be represented as a Joss function 
by the decision maker, L(xd, ad), the expected loss of an 
action ad is given by: 



L{a d ) = I L(x d ,a d )p(td,x d \Z k )dx d 



(3) 



Here we will consider only the cases where t d > t k . 
Given approximate computational methods for the condi- 
tional pdf, p(td,Xd\Z k ), we are able to obtain an estimate 
of the expected loss and also find the optimal Bayesian de- 
cision, if a set of decisions exists. 



L(a d ) 



L(xd,ad)p(td,Xd\Z k )dx d 



(4) 
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rgmin / L(x d , a d )p(t d ,x d \Z k )dx d (5) 

a d J 



The decision making process in the data assimilation 
framework is presented in FigQJleft). Obviously if we have 
a good approximation for the conditional pdf in the region of 
interest the same can be said for the expected loss. This sit- 
uation becomes more dramatic when a large deviation exists 
between the actual and the estimated conditional pdf in the 
region of interest. In the case of evaluation of a single de- 
cision, the algorithm can underestimate the actual expected 
loss, L(a,d) -C L(a,d), misguiding the decision maker with 
respect to the magnitude of the situation. In the case when 
a optimal decision has to be chosen, the large difference be- 
tween conditional pdfs may result in picking not only a sub- 
optimal decision but a very consequential one. 

While one can derive a new method to approximate the 
conditional pdf by including the loss function in the deriva- 
tion and reduce the difference in the region of interest to 
better approximate the expected loss, it will accomplish this 
at the expense of worsening the approximation of the condi- 
tional pdf in the rest of the state space. 

This will affect other estimates based on the conditional 
pdf, but the expected loss, that may be required in guiding 
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Figure 1 : Left figure represents the classic approach to de- 
cision making in the data assimilation context. The right 
figure shows the proposed model. 



initial weights will be set to zero, thus the initial uncertainty 
is not changed, but the evolution of the weights is dictated by 
the error in the Fokker-Planck equation. Thus if any proba- 
bility density mass is moving naturally towards the region of 
interest, the weights of the new Gaussian components will 
become greater than zero. Therefore the method will just 
make sure that if there is any probability density mass in the 
region of interest it will be found by the data assimilation 
method. 

In this paper we will consider only the forecast of the con- 
ditional pdf when no measurements are available between 
the current time and the decision time. A suggestion, on 
how this can be used in the case when we have observations 
to assimilate between the current time and the decision time, 
is given in Section|4] 



the decision process, like the mean of the pdf, the modes 
of the pdf, etc. These will be biased towards the region of 
interest, thus misleading the decision maker. 

In other words, if we call the computation of the expected 
loss of a given action as impact assessment and the compu- 
tation of the moments and other quantities based on the con- 
ditional pdf as situation assessment, one will require that 
both to be as accurate as possible. At the limit, if we can 
compute exactly the conditional pdf we obtain both impact 
assessment and situation assessment since we can quantify 
exactly the probability of all the outcomes. 

Since the decision maker holds important information re- 
garding the use of the conditional pdf obtained from the 
data assimilation method, we can incorporate this informa- 
tion in the data assimilation process in a non-intrusive man- 
ner (do not have to derive a new method), by supplement- 
ing the inputs into the data assimilation module. The pro- 
posed method is shown in Fig[TJi"ight), where a new interac- 
tion level is introduced between the decision maker and the 
data assimilation, that uses the contextual information pro- 
vided by the decision maker to supplement the inputs of DA 
/ change the environment in which DA is running. 

Therefore we want to find an approximation to the con- 
ditional pdf, p*(t d ,x d \Z k ), that addresses the interest held 
by the decision maker and provides both a better impact and 
situation assessment ih&n p(t d ,x d \Z k ) . These objectives can 
be captured by the following two relations: 

/ \p{t d ,x d \Z k ) - p* (t d ,x d \Z k )\ dx d < 

/ \p(t d ,x d \Z k ) - p(t d ,x d \Z k )\ dx d (6) 

L*(a d )-L(a d )\<\L(a d )-L(a d )\ (7) 

In the present paper, we will design an interaction level 
between the decision maker and the data assimilation mod- 
ule that approximates the conditional pdf using a Gaussian 
mixture. The interaction level is adding new Gaussian com- 
ponents to the initial uncertainty, such that they will be posi- 
tioned near the region of interest at the decision time. Their 



3 Approximation of the Conditional 
Probability Density Function 

The nonlinear filtering problem has been extensively stud- 
ied and various methods are provided in literature. The Ex- 
tended Kalman Filter (EKF) is historically the first, and still 
the most widely adopted approach to solve the nonlinear 
state estimation problem. It is based on the assumption that 
the nonlinear system dynamics can be accurately modeled 
by a first-order Taylor series expansion [4]. Since the EKF 
provides us only with a rough approximation to the a pos- 
teriori pdf and solving for the exact solution of the condi- 
tional pdf is very expensive, researchers have been looking 
for mathematically convenient approximations. 

In Ref. [1], a weighted sum of Gaussian density functions 
has been proposed to approximate the conditional pdf. The 
probability density function of the initial condition is given 
by the following Gaussian sum, 



p(t ,X ) = ^2woAf(x | /LtQ,Po) 



(8) 



Af(x\fi,P) = |27rP| _1/ exp 



"2 0-M) T P : (x-m) 



Let us assume that the underlying conditional pdf can be 
approximated by a finite sum of Gaussian pdfs 



N 



p(t,x(t) I Zfc) = 5>t|k AW) | /4| fc , Pj, fc ) (9) 



i'u, 



where ph\ k and PL fc represent the conditional mean and co- 
variance of the i th component of the Gaussian pdf with re- 
spect to the k measurements, and w\* k denotes the amplitude 
of i th Gaussian in the mixture. The positivity and normal- 
ization constraint on the mixture pdf, p(t,x\Z k ), leads to 
following constraints on the amplitude vector: 



N 



<\ k > o, 



Vf 



(10) 



A Gaussian Sum Filter [1] may be used to propagate and 
update the conditional pdf. Since all the components of the 
mixture pdf <j9j are Gaussian and thus, only estimates of 
their mean and covariance need to be propagated between 
tk and tk+i using the conventional Extended Kalman Filter 
time update equations: 



Mt|* =t(t,n\\ k ) (11) 

PJ|* = AJ| fe PJ, fe + Pl lk (K\kf + g(t, *4| fc )Qg T (*, ti\k) (12) 

*(M(*)) (13) 



K\k = 



9x(t) 



*(*)=*»ti* 



In Ref. [16] an update method of adapting the weights of 
different Gaussian components during propagation is given 
based on minimizing the error in the Fokker-Planck equa- 
tion for continuous dynamical systems, and on minimizing 
the integral square difference between the true forecast pdf, 
given by the Chapman-Kolmogorov equations and its Gaus- 
sian sum approximation in the discrete time dynamical sys- 
tems. 

The new weights are given by the solution of the follow- 
ing quadratic programing problem: 



1 



w 



/I A- 



argmin -w t , fe (L + I)w t | fc - w^w fe | fe (14) 



Willi I ' 



subject to ljvxi w t|fc = 1 

W t | fc > Ojvxl 

where w t |fc e K. 7Vxl is a vector of Gaussian weights, and 
the elements of L G R NxN are given by: 



By updating the forecast weights, not only can we obtain a 
more accurate estimate but also a better approximation to the 
conditional probability density function [15]. This weight 
update method during uncertainty propagation is very useful 
when the measurement model offer limited or no informa- 
tion in updating the states of the system. 

The estimated conditional pdf is used to compute the ex- 
pected loss. We require that the loss function provided is 
positive, finite everywhere and it is able to distinguish the 
important states from the unimportant ones. For simplicity 
the loss function used in this work has the following form: 



L(x dl a d ) =N(x d | Hl,^l) 



(16) 



Observe that even with a better approximation of the 
weights of different Gaussian components, these compo- 
nents may drift away from the loss function due to first or- 
der Taylor series uncertainty propagation and limited infor- 
mation in the measurement update, situation which may be 
avoided if the conditional pdf can be found in an exact way. 

Due to the approximations used in propagating the con- 
ditional pdf it may happen that no or very little probabil- 
ity density mass exists in the region of interest at the deci- 
sion time, depicted here by the loss function. Thus the ex- 
pected loss will be significantly underestimated, misguiding 
this way the decision maker. 
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(15) 4 Decision Maker - Data Assimilation 
Interaction Level 
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fc=l 



XjXk 



d^(t,x) = ±g(t,x)Qg T (t,x) 

Notice, to carry out this minimization, we need to evalu- 
ate integrals involving Gaussian pdfs over volume V which 
can be computed exactly for polynomial nonlinearity and 
in general can be approximated by the Gaussian quadrature 
method. 

The measurement update is done using Bayes rule, where 
the state and the covariance matrix are updated using the 
Extended Kalman Filter measurement update equations, and 
the weights are updated as it is shown in Ref. [2]. The equa- 
tions can be found in Ref. [15]. 



The iterative method proposed here, is adding a set of 
Gaussian components to the initial pdf that are sensitive to 
the loss function at the decision time. After propagation, 
these Gaussian components will be located near the center of 
support of the loss function at the decision time. Initially the 
weights of these components are set to zero, and they will 
be updated in the propagation step if any probability density 
mass is moving in their direction. The weights at the deci- 
sion time will give their relative contributions in computing 
the expected loss with respect to the entire conditional pdf. 

An algorithm that bears similarity to the simulated an- 
nealing and the progressive correction used in particle fil- 
ters [11], is proposed in selecting the initial Gaussian com- 
ponents. 

The main idea is to select a set of Gaussian components 
initially, propagate each one of them using the time update 
equations in the Extended Kalman Filter until the decision 
time is reached and based on their contributions to the ex- 
pected loss, select their means and variances in the initial 
distribution such that after propagation the expected loss is 
maximized. 

Let the initial pdf be given by p(to,Xo) as a Gaussian sum, 



Eq|S] Compute the mean and the variance of this mixture 

N 

Mo = 5I w oMo 



(17) 



/v 



Po = $>o [Pj + (i4 - Mo)(m - Mof] (18) 

i=l 

Assume that we want to add another M new Gaussian 
components to the initial pdf with zero weights and sen- 
sitive to the loss function. We sample the means of these 
Gaussian components from the initial distribution such that 
their equally weighted sum gives the mean in EqfTTI 



Mj 



p(to,x-o) for i = 1 . . .M — 1 



Hm = M/uo 



Af-l 

»=i 



(19) 



(20) 



The default covariance of the Gaussian components is D. 
We want to find the new covariance D* such that the covari- 
ance of the new Gaussian components matches the covari- 
ance of the initial pdf. Let D* = 7D. Thus we want to find 
7 such that we minimize the following expression: 



J-y = Tr 



7 = 



1 M / 

'"S^ 7D+ ^ ~ Mo)(m* - Mo) T 

1 M 

p ° ~~ aJ X^ Mi - mo)(m* - t*o] 



Tr D 



(21) 



(22) 



Only solutions 7 > are accepted. Otherwise we re- 
peat the sampling of the means, Eq. [19] Once we have 
the initial Gaussian sum components we propagate them us- 
ing the time update equations in the Extended Kalman Filter 
until we reach the decision time. Let fj,\ and P] be their 
means and covariances. The Gaussian components will then 
be weighted based on their contribution to the expected loss. 
A larger contribution means a more sensitive component to 
the loss function, thus a larger weight. 

To be able to compute the weights of the Gaussian com- 
ponents, make sure that all of them are fairly weighted and 
we are not running into numerical problems and also create 
an indicator to mark the end of the algorithm, we compute 
an inflation coefficient for the loss function. Let S^ — ocSl 
be the inflated covariance of the loss function. 

The inflation coefficient a is found such that the expected 
loss computed using the most distant Gaussian component 
from the loss function is maximized. Let the mean and 
the covariance of the most distant component be denoted by 
H™ ax and PT d ax respectively. 
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Ar(x d \» L ,aV L W(x d \»Z a *,T>Z a *)<ix d 



Af(^ L \nZ ax ,^i 
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(23) 



An equivalent way to seek a is by minimizing the nega- 
tive log of the above expectation. 

J mm = log[det(a£z, + P£°")] + 



/ max\T ( vi . nmax \ / max\ /«.\ 

(Ml -Mi* ) ( aT,L + P td J (Ml - Mt d ) (24) 



Let us denote K = aS L + P™ QX and U = (fi L 
Hf d ax ){H L - n? d ax ) T - We seek a > such that 



dJ n 
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Tr 



K-^l-K^VK-^l 



(25) 
(26) 



After a few mathematical manipulations, Eq[26] can be 
written in the following format: 



Tr 



K _1 Si(aI + P^'EZ 1 - UE^KT 1 ^ 



= (27) 



Denote A = K _1 Ei, and B = al + P^S" 1 - US" 1 . 
Observe that for a > the matrix A is symmetric and posi- 
tive definite. 

Lemma: If Tr[ABA] = and A is symmetric and posi- 
tive definite then Tr[B] = 0. 

Proof: Let A = VSV T be a singular value decom- 
position of matrix A, where V is a unitary matrix and S 
is a diagonal matrix. Our trace can now be written as 
Tr[ABA] = Tr[VSV T BVSV T ] = Tr[S 2 B]. 

If Tr[S"B] = then S B is a commutator. Thus there is 
X and Y such that S 2 B = XY YX. But B = S~ 2 XY - 
S~ 2 YX = X*Y - YX*, where X* = S~ 2 X. Therefore B is 
also a commutator, hence Tr[B] = 0. 

Applying the previous lemma to Eqf27]we get 



Tr 



d + prsy'-us; 1 



= 



(28) 



Therefore we accept solutions a > 1 that satisfy the fol- 
lowing relation 



a = — Tr 

n 



max \*sr^— 1 



(u - P£T)2Z 



(29) 



For a < 1 we stop the algorithm. Otherwise we continue 
getting the weights of the Gaussian components by solving 
the following optimization problem: 

w = argmin -w T Mw — w T N (30) 

w 2 

subjectto l^ fxl w = 1 (31) 

w>0 M xi (32) 

where w 6 R Mxl , M G R A/xM and N e M Mxl and their 
entries are given by: 



Hj=M\» J tdl0 



Kdo « p Lio + P td |o 



(33) 



ni=MUjL L /4|o, p L|o + £l] (34) 

The new pdf used to sample the new means is given by: 

M 

Pnew(t ,X a ) = ^A/"(x I m»/3D*) (35) 



Algorithm 1 Progressive Selection of Gaussian Compo- 
nents 

Require: td - decision time 

p(to,Xo) - initial probability density function 

M - number of extra Gaussian components 

D - Gaussian component covariance 

wtoi - add only Gaussian components with weights 

greater than this threshold 

L(x) - loss function W{x|ml, S^} 



Po = p(to, x ), a = oo, 7 = -1 
while (a > 1) & maxitcr do 

The mean and the covariance of the initial pdf 

EN i i 
i=l w f)Mo 

Po = Eti < [Po + (Mo - Mo)(Mo - Mo) T ] 

while (7 < 0) do 

Get the means of the Gaussian components 
Draw fii ~ p(to,xo) for i — 1 . . . M - 
Set jLtM = AIfi 



14: 



Ei=i Mi 



7 = 



-Ti- 



ll E,=i(Mi - Mo) (Mi - Mo) 1 



10: 



11: 



12: 



13: 



•&[D| 

end while 

Get the covariance of the Gaussian components 

P 0=7 D 

Propagate the moments from t = to t = td 

Mt|o= f (^Mt| ) 

p; |0 =AJ| Pi|o+Pj|o(Aj| ) T + gQg T 

Get the most distant component 

by computing the Mahanalobis distance 

di = (ml - mJ„| ) T ( p L|o + Si) (ml - M^io) 

M^| X . P"|o* = argmaxfa) ' 

Compute optimal a and the inflated matrix Y,* L 



a = J-Tr 

n 


((..max ,, T \(,,' m - ax ii t \ t pmaxly-l 
l^t d |0 ^AM td | MtJ ^loj^L 


if a < 1 then a = 1 end if 


Element 


iofMS IRMxM an( j N g |Mxl 



iij =^/*t d | 



AflnL 



MLio j P*<i|o 



r t d |o 



Mt d |0 ' Md|0 + 2j L 

Compute the weights 

w = arg min iw T Mw — w T N 

subject to l^ /xl w = 1 
w > Mx i 
Setpo = E-ii^^{x|M J ,/3P J } 
end while 
Setp NEW (io,x ) =p(t ,x ) + 



'3=1 

18: return pnew(*o,xo 



E ^^-0x^{ X |/iJ,Pj} 



Where /? < 1 is a coefficient that controls the decrease of 
the initial variance. If a has decreased from the previous it- 
eration this means that the Gaussian components are getting 
closer to the loss function and therefore we can decrease the 



variance of the initial distribution to finely tune the position 
of the Gaussian components, otherwise /? = 1. We continue 
to sample new means from the new pdf until a < 1 or the 
maximum number of time steps has been reached. The en- 
tire algorithm is presented in Table Q] While not the scope 
of this paper, the above method can also be applied when 
measurements are available between the current time and the 
decision time. The progressive selection algorithm will be 
applied every time a measurement has been assimilated and 
the a posteriori pdf has been found. The drawback of this 
procedure is that the number of Gaussian components will 
increase linearly with the number of measurements. Better 
ways to deal with the measurement updates are set as future 
work. 

In the case of multiple loss functions, the algorithm is run 
once for each one of the loss functions, creating sets of ini- 
tial Gaussian components sensitive to their loss function. 

5 Numerical Results 

To illustrate the concept of incorporating contextual infor- 
mation into the uncertainty propagation algorithm, we con- 
sider the following continuous-time dynamical system with 
uncertain initial condition given by 



•f'o 



x = sin(a;) + T(t) where 
A/"(-0.3, 0.3 2 ) 



1 



(36) 



The state space region of interest is depicted by the fol- 
lowing loss function, and the time of decision is at td = 
8 sec. 

L(x)=Af(x\~,0.1 2 ) (37) 

First we compute an accurate numerical solution based on 
the FPKE, and this will stand as the true probability density 
function. The performance measures for this method will 
be labeled as TRUTH. The evolution of the pdf using this 
method can be seen in Figf2^. 

Three other approximations for the pdf are provided in- 
cluding the method presented in this paper. The first ap- 
proximation propagates the initial uncertainty using the first 
order Taylor expansion, Eq.dTTb and Eq. (IT2t . also known as 
the Extended Kalman Filter time update equations, labeled 
later as EKF. The evolution of the pdf for this method is 
presented in Figfjp. 

For the next approximation method, we add another 5 
Gaussian components to the initial one, creating this way a 
Gaussian mixture with 6 components. The means of the new 
components are just the result of back propagation (from 
td = 8 sec to t c = sec) of 5 equidistant samples taken in 
the 3 sigma bound of the loss function support. The vari- 
ance of the new components is set to 10~ 10 and their initial 
weights are set to zero. The label used for this method is 
GS_BCK and the evolution of the pdf is shown in Figf2};. 
While all the means of the new Gaussian components are po- 
sitioned in the loss function support region, their variances 
are very being difficult to see the probability density mass in 
that region. 
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(a) TRUTH: Numerical approximation 
FPKE 



(b) EKF: first order Taylor expansion 
approximation 



(c) GSJ3CK: back propagated means 




GS_DEC 

TRUTH 
- LOSS_FUNCTION 
-GS_BCK 
-EKF 




■A..^. 



time (sec) -10 




time step 



(d) GS.DEC: progressive selection of 
Gaussian components 



(e) Probability density function at 
tfe = 8 sec 



(f) The evolution of the pdf used to sample 
the means of the Gaussian components 



Figure 2: The evolution of the Forecast pdf and the Sampling pdf 



We apply the method presented in this paper to gener- 
ate at most 5 new Gaussian components to be added to the 
initial condition. Their means and variances are returned 
by the progressive selection algorithm, Alg.l. The initial 
weights of the new Gaussian components have also been set 
to zero. The default value for the j3 coefficient is 0.9 and 
Gaussian components are included only if their weights are 
greater than wtoi = 10 -3 . The label used for this method is 
GS JDEC and its corresponding pdf is presented in Fig|2]l 

The evolution the Gaussian components for the last two 
methods is also achieved using the first-order Taylor expan- 
sion, but it is interrupted every At = 0.5 sec to adjust the 
weights of different Gaussian components using the opti- 
mization in Eq.([T4l>. 

The following performance measures have been com- 
puted for the methods used in the experiment: 



/Terr 

ISD 
WISD 



/ L(x)p(t d ,Xd)dxd 



\Ld — Ld\ 



(38) 
(39) 



p(td,x d ) -p(td,x d )\ dx d (40) 

/ L(x)\p(t d ,x d ) -p{t d ,x d )\ 2 dx d (41) 



In Figf2^ it is plotted the forecast pdf at time td for all the 
methods. Our method, GS_DEC, is able to better estimate 
the probability density mass in the region of interest. 

In Table I, we present the performance measures after 
500 Monte Carlo runs. The expected loss given by the 
GSJDEC method is consistently better approximated over 
all the Monte Carlo runs than the EKF and the GS_BCK 
method. We also are able to consistently give a better ap- 
proximation to the pdf overall and also in the region of in- 
terest than the EKF method, which justify the use of this 
method. Compared with the GS_BCK we do a better job in 
average in approximating the pdf which suggests that there 
is a trade off in selecting the Gaussian components regarding 
their means and variances. 

In Fig. [2j it is plotted the evolution of the pdf used to 
sample the means of the new Gaussian components for one 
particular run. The pdf used in the first iteration is our initial 
uncertainty and we see how it converges, as the number of 
iterations increases, to a particular region in the state space 
that is sensitive to the loss function at the decision time. 

6 Conclusion 

An interaction level between the decision maker and the 
data assimilation module has been designed, such that we 



Table 1 : Performance measures - 500 Monte Carlo runs 



L„ 



Rp 



ISD WISD 



TRUTH 

EKF 

GS.BCK 

GS_DEC (mean) 



0.0332 N/A N/A N/A 

4.93E-09 1.0000 0.1840 0.0015 

0.0001 0.9968 0.0536 0.0015 

0.0256 0.2300 0.0470 0.0004 



GS_DEC: Percentile Table 
Percent L^ R er r 



500 Observations 
ISD WISD 



0.0% 


0.0010 


0.0151 


0.0368 


0.0002 


5.0% 


0.0142 


0.0230 


0.0378 


0.0003 


10.0% 


0.0177 


0.0271 


0.0380 


0.0003 


25.0% 


0.0229 


0.0566 


0.0387 


0.0003 


50.0% 


0.0257 


0.2270 


0.0491 


0.0003 


75.0% 


0.0313 


0.3090 


0.0514 


0.0004 


90.0% 


0.0323 


0.4670 


0.0574 


0.0006 


95.0% 


0.0324 


0.5710 


0.0601 


0.0007 


100.0% 


0.0327 


0.9700 


0.0705 


0.0014 



can incorporate contextual information held by the decision 
maker into the data assimilation process to better approxi- 
mate the conditional probability function. 

The progressive selection algorithm is run once at the be- 
ginning of the simulation to supplement the initial uncer- 
tainty with new Gaussian components that are sensitive to 
the loss function at the decision time. The weights of all the 
Gaussian components are then updated during the propaga- 
tion based on the Fokker Planck Equation. This way we 
obtain not only a better approximation of the probability 
density function in the region of interest but also a better 
approximation overall. 

The cost of this overall improvement is an increase in 
the number of Gaussian components. The principal benefit 
is not the modest increase in accuracy overall, but the 
significantly enhanced accuracy within the decision maker's 
region of interest. 
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